Summary


This chapter provides a short survey of the most promising concepts recently applied in the
lectin-based analysis of glycoproteins. The first part describes the most exciting analytical
approaches used in glycoprofiling, based either on the integration of nanoparticles, nanowires,
nanotubes or nanochannels, or on novel transducing platforms that allow very low levels of
glycoproteins to be detected in a label-free mode of operation. The second part describes
recombinant lectins carrying several tags, applied for oriented and ordered immobilisation of
lectins. Besides already established concepts of glycoprofiling, several novel aspects that we
think will be taken into account for future, more robust glycan analysis are described,
including modified lectins, peptide lectin aptamers and DNA aptamers with lectin-like
specificity introduced by modified nucleotides. The last part of the chapter describes the
novel concept of a glycocodon, which can lead to a better understanding of glycan-lectin
interactions and to the design of novel lectins with so far unknown specificities and/or better
affinities toward a glycan target, or to the rational design of peptide lectin aptamers or DNA
aptamers.


Keywords: biosensors, glycomics, lectins, nanoparticles, DNA aptamers, lectin peptide 
aptamers, recombinant lectins 


1. Introduction


Since the introduction of DNA biochips in 1995 (1), the technology has been intensively
applied in assays of genome-wide expression to seek information about possible functions of
novel or poorly characterised genes (2), as well as for diagnostic purposes (3). Even though
DNA microarray technology has shed light on many physiological functions of genes by
determining the expression of gene clusters, there is quite often only a very low correlation
between RNA and protein abundance, both in single-cell organisms (4) and in higher
ones, including humans (5). Since quantitative analysis of proteins is central to proteomics,
with a focus on the design of novel drugs, disease diagnostics and therapeutic
applications, protein microarrays were successfully launched to address these issues (6).


Analysis of finely tuned post-translational modifications (PTMs) of proteins is an additional
challenge for current analytical technology. Glycosylation is a highly abundant form of PTM,
and it is estimated that 70-80% of human proteins are glycosylated (7).
The importance of glycans is further highlighted by the fact that 70% of all therapeutic
proteins are glycosylated (8). Glycan-mediated recognition plays an important role in many
different cellular processes such as fertilisation, immune response, cell differentiation,
cell-matrix interaction and cell-cell adhesion (9, 10). Glycans present on the surface of cells
are also involved in pathological processes including viral and bacterial infections,
neurological disorders, and tumour growth and metastasis (3, 11-16). Thus, a better
understanding of glycan-mediated pathogenesis is essential in order to establish a “policy” for
developing efficient routes of disease treatment. Several recent studies serve as good examples,
e.g. “neutralisation” of various forms of viruses (17, 18) or more efficient vaccines against
various diseases (19, 20). Altered glycosylation of a protein backbone can be effectively
exploited in early-stage diagnostics of several diseases, including different forms of cancer with
known glycan-based biomarkers (21-23). Moreover, many previously established and even
commercially successful strategies used to treat diseases are currently being revisited in light
of glycan recognition in order to lower side effects, enhance serum half-life or decrease
cellular toxicity (3, 24, 25). Recently, the first glyco-engineered antibody was approved for the
market, which the authors called “a triumph for glyco-engineering” (26).


Glycomics focuses on revealing finely tuned reading mechanisms in the cell orchestra
based on the graded affinity, avidity and multivalency of glycans (i.e. sugar chains covalently
attached to proteins and lipids) (27). Glycans are information-rich molecules suited to serve as
coding tools of the cell, since they can form an enormous number of unique sequences
from basic building units (28). It is estimated that the cellular glycome can comprise up to
500,000 glycan-modified biomolecules (proteins and lipids) formed from 7,000 unique glycan
sequences (29). Thus, it is no surprise that the glycome is sometimes referred to as the “third
alphabet” in biology, after genetics and proteomics (30). This huge glycan variation can explain
human complexity in light of a paradoxically small genome. Glycan complexity, together
with the similar physico-chemical properties of glycans, is the main reason why progress in
glycomics has lagged behind advances in genomics and proteomics (31).


Traditional glycoprofiling protocols rely on the release of glycans from a biomolecule, with
subsequent quantification by an array of techniques including capillary electrophoresis, liquid
chromatography and mass spectrometry (32-34, 30, 35-37). An alternative route to
glycoprofiling applies lectins, natural glycan-recognising proteins (28, 38, 39), in
combination with various transducing protocols (12, 40, 41). The most powerful glycoprofiling
tool relies on lectins arrayed on solid surfaces for direct analysis of glycoproteins, glycolipids,
membranes and even glycans on the surface of intact cells (11, 42, 12, 43). Even though
lectin microarrays offer high-throughput assay protocols with minute consumption of
samples and reagents, they have some drawbacks: the need to fluorescently label the
sample or the lectin, which negatively affects the performance of detection (11, 12), relatively
high detection limits and quite narrow working concentration ranges. Thus, the ideal
detection platform should be based on protocols that require no labelling of the glycoprotein
or the lectin, in a way similar to natural processes occurring within a cell (30).


Lectins (lat. legere = to choose) are proteins able to recognise and reversibly bind to free 
or bound mono- and oligosaccharides (44). They are not usually catalytically active, do not 
participate in the immune response of higher organisms and can be found in viruses, 
bacteria, fungi, plants and animals. They are therefore a relatively heterogeneous group of 
oligomeric proteins belonging to distinct families with similar sequences and are considered 
as natural glycocode decipherers (28). Lectins, unlike antibodies, have a low specificity and 
affinity with Ka ranged from 10° to 107 M (30) and lectins with a new specificity cannot be 
raised in a way similar to antibodies. 


In the following sections we focus on ways to improve glycan detection, either by
applying novel nanoscale-controlled patterning protocols and nanoengineered devices, or by
applying novel recombinant lectins, lectin-like aptamers or lectin peptide aptamers. The
final part of this chapter focuses on a completely novel area in glycomics: the
idea of a glycocodon.


2. Perspectives of Novel Formats of Analysis Applicable in Glycomics 


The use of nanotechnology, sophisticated nanoscale patterning protocols and advanced
detection platforms can help to overcome the drawbacks of lectin microarray technology,
allowing it to work in a label-free mode of operation with high sensitivity, low detection
limits and a wide concentration window; in some cases, real-time analysis of a binding event
is possible (9, 45-47, 30, 48-51). These devices differ from traditional methods in their mode
of signal transduction and are divided here into three categories
according to their mode of action. Traditional lectin-based analytical techniques
(i.e. surface plasmon resonance and quartz crystal microbalance) are covered by other
chapters of this book and are not discussed here.


2.1. Mechanical Platforms


Microcantilever biochips offer a novel approach for detecting molecular binding based
on a change in the mass accumulated on the surface of a cantilever during biorecognition. It is
a label-free technique that allows a biospecific interaction to be monitored in real time, so
that affinity constants of the interaction can be acquired. When biorecognition takes place, the
particular cantilever bends, which shifts the angle of a laser beam and allows direct
detection of the binding event (Fig. 1A).
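
To illustrate how real-time, label-free monitoring can yield affinity constants, the sketch
below uses a simple 1:1 (Langmuir) binding model. The model and all rate constants in it are
illustrative assumptions, not values or methods taken from the cited cantilever studies.

```python
import math

def fractional_coverage(t_s, conc_M, k_on, k_off):
    """Surface coverage theta(t) for 1:1 binding, starting from an empty surface.

    Analytical solution of d(theta)/dt = k_on * C * (1 - theta) - k_off * theta.
    """
    k_obs = k_on * conc_M + k_off          # observed rate constant (1/s)
    theta_eq = k_on * conc_M / k_obs       # equilibrium coverage
    return theta_eq * (1.0 - math.exp(-k_obs * t_s))

def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_d = k_off / k_on (in M)."""
    return k_off / k_on

# Hypothetical rate constants chosen only for illustration.
K_ON = 1.0e5    # association rate constant, 1/(M s)
K_OFF = 1.0e-3  # dissociation rate constant, 1/s
K_D = dissociation_constant(K_ON, K_OFF)

if __name__ == "__main__":
    # Analyte concentration set equal to K_d, so coverage tends towards one half.
    for minutes in (1, 5, 30):
        theta = fractional_coverage(minutes * 60, K_D, K_ON, K_OFF)
        print(f"t = {minutes:>2} min, theta = {theta:.3f}")
```

Fitting such a curve to a measured response at several analyte concentrations gives k_on and
k_off, and their ratio gives the dissociation constant.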


Fig. 1 near here 


The device was prepared with variations in the density and composition of a glycan
determinant immobilised on the cantilever surface via thiol-gold chemistry. Namely,
galactose, trimannose and nonamannose were attached to the surface and probed with two
different lectins: cyanovirin-N and Concanavalin A (Con A). The latter was successfully
detected on a surface with optimal glycan composition down to the nanomolar range (52). The
sensitivity is not impressive, but it is comparable to that of traditional surface plasmon
resonance and quartz crystal microbalance lectin-based biosensors. Seeberger's group later
extended this concept to the analysis of several Escherichia coli strains on microcantilever
biochips functionalised with different mannosides, achieving specific and reproducible
detection of E. coli cells over four orders of magnitude (53).


2.2. Electrical Platforms 


Electrical/electrochemical detection has for some time been utilised quite often in combination
with other techniques in the field of glycomics (54). Electrical platforms for detection of a
biorecognition event are primarily based on changes in electrical signals such as
resistance, impedance, capacitance, conductance, potential and current (55). These
analytical techniques are usually non-destructive and extremely sensitive, offer quite a wide
working concentration range and can be operated in an array format of analysis (30).


2.2.1. Electrochemical impedance spectroscopy (EIS)


The most frequently used label-free electrochemical technique is EIS, which is based on
an electrical perturbation of a thin layer on a conductive surface by a small-amplitude
alternating signal and provides characteristics of this interface that can be utilised in
sensing. EIS results are typically presented as a Nyquist plot in the complex plane, which, by
fitting an equivalent circuit, directly provides information about the electron transfer
resistance of a soluble redox probe (Fig. 1B). When biorecognition takes place, the
electrode interface is modified and a subtle change in the characteristics of the interfacial
layer can be used for detection. EIS investigation is most frequently performed in the presence
of a redox probe, with the change in resistance of the interface used for signal generation.
EIS is extensively used as a non-destructive technique for reliable analysis of surface
conditions. It allows complex biorecognition events to be probed in a simple, sensitive and
label-free manner and is becoming increasingly popular for the development of electrochemical
lectin-based biosensors for glycan determination (9, 30).
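
The link between a Nyquist plot and the electron transfer resistance can be sketched with a
simplified Randles-type equivalent circuit: a solution resistance in series with a parallel
combination of charge transfer resistance and double-layer capacitance. The circuit choice and
all component values below are assumptions for illustration only; measured lectin biosensor
data are usually fitted with more elaborate circuits.

```python
import math

def randles_impedance(freq_hz, r_s, r_ct, c_dl):
    """Impedance (ohm) of R_s in series with (R_ct parallel to C_dl)."""
    omega = 2.0 * math.pi * freq_hz
    z_parallel = r_ct / (1.0 + 1j * omega * r_ct * c_dl)
    return r_s + z_parallel

def nyquist_points(freqs_hz, r_s, r_ct, c_dl):
    """Return (Z_real, -Z_imag) pairs, the usual Nyquist plot axes."""
    points = []
    for f in freqs_hz:
        z = randles_impedance(f, r_s, r_ct, c_dl)
        points.append((z.real, -z.imag))
    return points

def semicircle_diameter(points):
    """Estimate R_ct as the span of Z_real across the semicircle."""
    reals = [p[0] for p in points]
    return max(reals) - min(reals)

# Hypothetical values: analyte binding is modelled as an increase in R_ct.
FREQS = [10 ** (k / 10.0) for k in range(-20, 61)]   # 0.01 Hz to 1 MHz
before = nyquist_points(FREQS, r_s=100.0, r_ct=5_000.0, c_dl=1e-6)
after = nyquist_points(FREQS, r_s=100.0, r_ct=8_000.0, c_dl=1e-6)

if __name__ == "__main__":
    print(f"R_ct before binding ~ {semicircle_diameter(before):.0f} ohm")
    print(f"R_ct after binding  ~ {semicircle_diameter(after):.0f} ohm")
```

In this simplified picture the diameter of the semicircle equals the charge transfer
resistance, so a binding event that hinders access of the redox probe to the electrode shows up
as a wider semicircle.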


Initial efforts to detect glycoproteins by EIS were launched by the group of Prof. Joshi with
sialic acid-binding Sambucus nigra agglutinin (SNA) and galactose-binding peanut
agglutinin covalently immobilised on printed circuit board electrodes (56). The assays were
very quick, with a response time of 80 s, and glycoprotein detection was sensitive down
to 10 pg mL⁻¹ (i.e. 150 fM), while using a cost-effective electrode material (56). The group of
Prof. Oliveira put substantial effort into using lectin-modified surfaces with EIS detection to
discriminate between samples from healthy humans and samples from patients infected by the
mosquito-borne Dengue virus (breakbone fever), which has a high mortality rate (9). Their device
with two different lectins immobilised on gold nanoparticles offered a detection limit in the low
nM range (57). Another EIS-based biosensor was built on the surface of a silicon chip with
an array of gold electrodes interfaced with a nanoporous alumina membrane containing a high
density of nanowells (58). The biosensor offered highly reliable assays and good agreement
with enzyme-linked lectin assays (ELLA). The detection limit of the biosensor for its analyte
was 5 orders of magnitude lower than that of ELLA (i.e. 20 fM vs. 4.6 nM). The assay time
of 15 min for the biosensor was much shorter than the 4 h needed for ELLA. Moreover, a
minute amount of sample (10 µL) was sufficient for analysis by the biosensor (58).
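
Detection limits in this section are quoted both as mass and as molar concentrations. The
conversion is simply molar concentration = mass concentration / molar mass. The sketch below
back-calculates the molar mass implied by the pair 10 pg per mL and 150 fM quoted above and
checks the stated gap between 20 fM and 4.6 nM. The function names are illustrative, and the
implied molar mass is an inference from the quoted figures, not a value given in the cited work.

```python
import math

def molar_from_mass(mass_g_per_l, molar_mass_g_per_mol):
    """Convert a mass concentration (g/L) into a molar concentration (mol/L)."""
    return mass_g_per_l / molar_mass_g_per_mol

def implied_molar_mass(mass_g_per_l, molar_mol_per_l):
    """Molar mass (g/mol) implied by a pair of mass and molar concentrations."""
    return mass_g_per_l / molar_mol_per_l

def orders_of_magnitude(high, low):
    """Number of decades separating two concentrations given in the same unit."""
    return math.log10(high / low)

PG_PER_ML_TO_G_PER_L = 1e-9   # 1 pg/mL = 1e-12 g per 1e-3 L = 1e-9 g/L

mass_conc = 10 * PG_PER_ML_TO_G_PER_L              # 10 pg/mL expressed in g/L
molar_mass = implied_molar_mass(mass_conc, 150e-15)  # paired with 150 fM

if __name__ == "__main__":
    print(f"Implied molar mass: {molar_mass / 1000:.0f} kDa")
    print(f"EIS vs. ELLA: {orders_of_magnitude(4.6e-9, 20e-15):.1f} decades")
```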


In our recent work we focused on the development of ultrasensitive impedimetric lectin
biosensors with detection limits down to the single-molecule level, based on controlled
architecture at the nanoscale (59-61). In the first study the biosensor was able to detect a
glycoprotein in a concentration window spanning 7 orders of magnitude with a detection limit
of 0.3 fM, which was the lowest glycoprotein concentration detected until then
(59). In the following study the incorporation of gold nanoparticles offered an even lower,
unprecedented detection limit of 0.5 aM with quite a wide dynamic concentration range
(61). In our last study the EIS biosensors were constructed with three different
lectins in order to detect changes on immunoglobulins with the progression of rheumatoid
arthritis in humans. The biosensor with improved antifouling properties offered a detection
limit in the fM range and worked properly even with 1,000x diluted human plasma. The
biosensor performance was directly compared to the state-of-the-art glycoprofiling tool based
on fluorescent lectin microarrays, which has a detection limit at the nM level (60). Moreover, a
sandwich configuration offered a detection limit down to aM concentrations (60). A detection
limit down to the fM range for analysis of alpha-fetoprotein (a biomarker for hepatocellular
carcinoma) was recently observed on a device modified with arrays of single-walled carbon
nanotubes and wheat-germ agglutinin, with EIS as the transducing mechanism (62).


2.2.2. Nanotube field effect transistor (NTFET) sensors


In NTFETs, semiconducting nanotubes or nanowires act as a channel between two metal
electrodes (source and drain), which are held at a constant bias voltage, while the channel
conductance is modulated via a so-called gate electrode (Fig. 2A) (49, 30). When the device with
an immobilised biorecognition element is exposed to a sample containing its binding partner, the
resulting change in device conductivity can be used to quantify the analyte. The application of
FET devices in the field of glycobiology was pioneered by Star's group (63, 64). In the initial
study carbon nanotubes were employed as the channel, a glycoconjugate was immobilised
on the surface of the device, and an analyte lectin could be detected down to a concentration
of 2 nM (63). A subsequent study confirmed that a carbon nanotube biosensor for detection
of a lectin outperformed a device based on graphene (64). However, semiconducting carbon
nanotubes of high purity are required to achieve better signal quality, which remains a goal
for further research. Silicon nanowires were applied as a FET channel more effectively than
carbon nanotubes and graphene, since a detection limit for a lectin down to 100 fg mL⁻¹ (i.e.
fM level) was achieved (65). Even though such a remarkably low concentration of lectin was
detected with a glycan-modified FET device, analysis of glycoproteins on a lectin-modified
surface can be more problematic: the device is able to detect changes only in close
proximity to the surface, and the lectin-glycoprotein biorecognition event can be too far from
the surface to be detected. A solution, however, may lie in the use of ultra-diluted (100x
or 1,000x) phosphate buffers, which allow a biorecognition event to be detected at distances of
7.5-23.9 nm from the surface; for analysis of protein levels in serum, however, the serum has to
be desalted prior to detection (66).
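
The benefit of diluting the buffer can be rationalised by the Debye screening length, which
grows as the ionic strength falls. The sketch below evaluates the standard Debye length
expression for an aqueous electrolyte at 25 °C. The ionic strength assumed for the undiluted
phosphate buffer (0.16 M) and the relative permittivity of water (78.5) are assumptions, not
values taken from the cited study.

```python
import math

# Physical constants (SI units)
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K
N_A = 6.02214076e23          # Avogadro constant, 1/mol
E_CHARGE = 1.602176634e-19   # elementary charge, C

def debye_length_nm(ionic_strength_M, temp_K=298.15, eps_r=78.5):
    """Debye length (nm) of an aqueous electrolyte of given ionic strength (mol/L)."""
    ionic_strength_si = ionic_strength_M * 1000.0   # convert mol/L to mol/m^3
    lam_sq = (eps_r * EPS0 * K_B * temp_K) / (
        2.0 * N_A * E_CHARGE ** 2 * ionic_strength_si
    )
    return math.sqrt(lam_sq) * 1e9

ASSUMED_BUFFER_IONIC_STRENGTH_M = 0.16   # assumption for the undiluted buffer

if __name__ == "__main__":
    for dilution in (1, 100, 1000):
        i_m = ASSUMED_BUFFER_IONIC_STRENGTH_M / dilution
        print(f"{dilution:>5}x dilution: Debye length ~ {debye_length_nm(i_m):.1f} nm")
```

Because the screening length scales with the inverse square root of ionic strength, a 100x
dilution extends it tenfold. Under the stated assumptions the result is of the same order as
the distances quoted above.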


Fig. 2 near here 


Another interesting approach applied in glycoassays was based on immobilisation of
mannose inside a nanochannel; changes in the nanochannel conductance after
binding of Con A were detected in a concentration window from 10 nM to 1,500 nM (67). The
question of how sensitive the analysis of glycoproteins with lectins immobilised within a
nanochannel can be still has to be answered.


2.3. Optical Platforms 


Two different optical sensing mechanisms have been applied in label-free glycoanalysis.
The first is based on the intrinsic fluorescence of carbon nanotubes and the second on
localised surface plasmon resonance detected on gold nanoislands. Both concepts have the
advantage that no electronic interfacing is required, which is a problematic
aspect of FET devices, and the nanoscale sensors require only a minute amount of sample
for analysis.


2.3.1. Quenching of an intrinsic carbon nanotube fluorescence


This detection platform employs fluorescent carbon nanotubes with a flexible NTA-nickel
tether attached, which on the one hand modulates the fluorescence intensity of the carbon
nanotubes and on the other serves as a coupling agent for His6-tagged lectins. When a
glycoprotein interacts with an immobilised lectin, the nickel ion moves away from the carbon
nanotube surface, partially restoring the quenched fluorescence of the carbon nanotubes
(Fig. 2B). The increase in fluorescence output can be applied not only for quantification of a
glycoprotein level, but also for monitoring the interaction in real time, providing kinetic and
affinity constants. The absolute detection limit of the device for the glycoprotein was not that
impressive (2 µg, i.e. 670 nM), but the authors believe the device has room for improvement
(e.g. by using high-quality nanotube sensors) (47, 68). In a recent study the authors extended
this approach to glycoprofiling of different forms of IgGs (69).


2.3.2. Localised surface plasmon resonance


Noble metal nanostructures exhibiting a localised surface plasmon resonance, sensitive to 
changes in the refractive index near the nanostructures, can be integrated into a biosensor 
device. There is only one report on application of such a device in glycoassays (Fig. 3). In 
this case mannose was immobilised on the surface of Au nanoislands and sensitivity towards 
a lectin was probed under stationary or flow conditions. Moreover, kinetic parameters of the
lectin interaction were obtained in agreement with traditional techniques. Mannose-coated
transducers offered excellent selectivity towards Con A down to a concentration of 5 nM in
the presence of a large excess of bovine serum albumin (BSA) (70).


Fig. 3 near here 


In summary, of all the novel nanoengineered devices offering a
label-free mode of detection, EIS-based biosensors have great potential for glycan analysis,
since they can clearly outperform the state-of-the-art tool in glycoprofiling, lectin
microarrays, in terms of both the detection limit achieved and the dynamic concentration range
offered. EIS lectin biosensors were successfully applied in the analysis of complex
samples such as human serum, even at a dilution of 1,000x. Only such sensitive devices can
detect ultralow concentrations of disease markers directly in human serum, a feature
important for early-stage prognosis of a particular disease. Moreover, biosensor devices with
a detection limit down to the single-molecule level have the potential to be applied for
identification of novel biomarkers, which can be present in human body fluids at concentrations
not detectable by other lectin-based analytical platforms. Other analytical tools
based on mechanical, FET and optical signal transduction mechanisms have yet to prove their
analytical potential in glycoprofiling with lectins immobilised on the surfaces of such devices.


3. Perspectives in Lectin Engineering


The glycan binding site of a lectin is usually a shallow groove or a pocket present at the
protein surface, or at the interface of oligomers (7). Four main amino acids are part of the
affinity site, namely asparagine, aspartic acid, glycine (arginine in Con A) and an aromatic
residue, which interact with the glycan via hydrogen bonds and hydrophobic interactions (71).
Ionic interactions are especially involved in the recognition of negatively charged glycans
containing sialic acids. Lectin-monosaccharide binding is relatively weak, which is why several
approaches have been applied to enhance the practical utility of lectins in glycoprofiling (7).


Recombinant DNA technology for producing lectins was traditionally applied to establish
primary structure; to study genetics, evolution and biosynthesis; to elucidate the role of
amino acids in recognition; to produce lectins with altered specificity and/or affinity; and to
study their function in the organism of origin (72). A novel trend is to apply this technology to
produce lectins for the construction of various lectin-based biodevices.
Recombinant lectin technology can significantly reduce the drawbacks of traditional lectin
isolation, such as a long processing time, often quite a low yield, and batch-to-batch variation
in product quality depending on the source, with the presence of various contaminants or
different lectin isoforms (73, 11). Moreover, recombinant technology makes it possible both to
produce lectins without any glycosylation, which can in many cases complicate glycoprofiling, by
expression in prokaryotic hosts, and to introduce various tags (His6-tag, glutathione
S-transferase), which can be effectively utilised not only for a one-step purification process
but, more importantly, for oriented immobilisation of lectins on various surfaces (38). Although
lectin peptide aptamers have not been produced yet, it is only a question of time before such
artificial glycan-binding proteins emerge as an efficient tool in the area of glycobiology.
Another player expected to leave a substantial footprint in the area of glycoprofiling is
lectin-like aptamers based on a genetic alphabet expanded by the introduction of modified
nucleotides. Other concepts based on modified lectins with added value in glycoprofiling are
described last.


3.1. Oriented Immobilisation of Recombinant Lectins 


Controlled immobilisation of lectins on a diverse range of surfaces can have a
beneficial effect on the sensitivity of assays, since lectins can be attached in such a way that
the biorecognition site is directly exposed to the solution phase for efficient biorecognition.
As a result, almost 100% of immobilised lectin molecules can have the proper orientation, with an
increased chance of capturing the analyte (Fig. 4A). Moreover, the presence of a linker that
attaches the tag to the protein backbone can significantly lower possible interaction of the
protein with the surface, an interaction that can eventually lead to denaturation of the protein
(74).


Fig. 4 near here 


In a pilot study seven bacterial lectins carrying a His6-GST tag were expressed in E. coli
and subsequently applied for construction of a completely recombinant lectin microarray,
which was utilised to probe differences between several tumour cell lines (ACHN, TK10,
SK-MEL-5 and M14) (75). For that purpose, membrane micelles isolated from the
tumour cell lines were employed. Such a procedure avoids the use of proteases, which can
change the composition of samples containing glycoproteins, and at the same time there is no
need to work with whole cells, which makes it possible to work with small spot sizes. The results
showed distinct variations between tumour cell lines expressing different glycan moieties. In
order to have a control spot on the lectin microarray, a mutated form of one lectin was
introduced, allowing the specificity of interaction to be quantified. Moreover, it was found that
the presence of monosaccharides during the lectin printing process gave better resolved spot
morphology and better lectin activity (75). In a subsequent study by Mahal's group, the effect of
oriented immobilisation of lectins on the sensitivity of lectin microarray assays was quantified.
Oriented immobilisation of lectins offered a detection limit of approx. 12 ng mL⁻¹ (ca. 640 pM
protein) (76), significantly lower than the detection limit of 10 µg mL⁻¹ achieved
on a lectin microarray with random immobilisation (33). In a further study the group
developed oriented immobilisation of recombinant lectins by a single-step deposition of lectins
together with glutathione onto an activated chip surface. Such an approach simplifies the
overall immobilisation process because the surface does not need to be modified with
glutathione prior to lectin immobilisation (77).


Another group developed oriented immobilisation of recombinant lectins produced with
a fused Fc fragment. Such a fragment has an affinity towards protein G (expressed in
Streptococcus sp. and functionally similar to protein A), and the carbohydrate moiety of the Fc
fragment has an affinity for boronate derivatives (Fig. 4B). Thus, a surface modified with
boronate derivatives or protein G was effectively applied for oriented immobilisation of a
recombinant lectin via the fused Fc fragment (78). Although the boronate immobilisation approach
showed the highest sensitivity of detection, the presence of boronate functionalities on the chip
surface induced non-specific interactions with glycoproteins, and thus dextran blocking was
introduced to minimise unwanted glycoprotein interactions. An additional drawback of such an
approach can be expected from the introduction of an Fc fragment bearing glycan entities, which
can interact with glycan-binding proteins that might be present in complex samples.


3.2. Perspectives for Peptide Lectin Aptamers 


An alternative to the production of mutated forms of lectins or glycosidases (7) for subsequent
application in glycoprofiling might in future be the preparation of novel forms of
glycan-recognising proteins, termed here peptide lectin aptamers (PLA). Such proteins will be
peptide aptamers with a lectin-like affinity, able to recognise various forms of glycans. The
term peptide aptamer was coined by Colas et al. in 1996 (79) and is defined as a combinatorial
protein molecule having a variable peptide sequence, with an affinity for a given target
protein, displayed on an inert, constant scaffold protein (80). Construction of various
bioanalytical devices such as peptide lectin aptamer microarrays can benefit from such
biorecognition elements, since it would be possible to generate an “army of terracotta soldiers”
that look almost identical at the molecular level, apart from the distinct “facial” feature of
each entity provided by a unique peptide sequence. Even though lectin peptide aptamers have not
been prepared yet, in our opinion it is only a question of time before such recognition elements
are prepared.


Beneficial features of such a protein will be high solubility, the small, uniform size of the
scaffold protein with extended chemical and thermal stability, and the possibility to express
such proteins in prokaryotic expression systems, which is a cost-effective process (81).
Moreover, when a small PLA is immobilised on the surface of various bioanalytical
devices, a higher density of the biorecognition element can enhance the sensitivity of detection
while suppressing non-specific interactions and lowering background signals (80). Peptide
aptamers are produced by protein engineering from high-complexity combinatorial libraries
with appropriate isolation/selection methods (82, 80, 81). Thus, knowledge of
the protein structure and of the mechanism behind binding is not necessary. There are, however,
some requirements the scaffold protein has to meet, such as a lack of biological activity and
the ability to accommodate a wide range of peptides without changing its 3-D structure (83).


Currently there are over 50 proteins described as potential affinity scaffolds, but only
a few of them have progressed beyond the proof-of-concept phase (81). Scaffold
proteins were constructed from a diverse range of proteins differing in origin, size, structure,
engineering protocols, mode of interaction and applicability, and typically have from 58 up to
166 amino acids (Fig. 5) (81, 84). For example, peptide aptamers based on the Stefin A protein
(a cysteine protease inhibitor) work well in an immobilised state on gold and modified
gold surfaces, with the Kd of a peptide aptamer for its analyte down to the nM range (85, 86).
Moreover, the scaffold based on Stefin A can accommodate and tolerate more than one
peptide insert, which can dramatically widen the practical application of such peptide aptamers
(87). For lectin peptide aptamers, it may be possible to use a restricted set of amino acids
enriched in the four amino acids involved in glycan recognition, a concept
that was successfully implemented for recognition of a maltose binding protein in the past
(88).


3.3. Perspectives for Novel Lectin-like Aptamers 


The name aptamer, derived from the Latin expression “aptus” (to fit) and the Greek word
“meros” (part), was coined in 1990 by Ellington and Szostak to describe artificial
RNA molecules binding to a small organic dye (89). Aptamers are single-stranded
oligonucleotides that selectively bind small molecules, macromolecules or whole cells,
generally with a size of 15-60 nucleotides (i.e. 5-20 kDa) (Fig. 6) (80). Additional advantages
of aptamers include the relative simplicity of chemical modification (introduction of a biotin
or a fluorescent label), simple regeneration/reusability, and stability at high temperature
and/or at high salt concentration (90). Due to their small size, aptamers can be effectively
attached to interfaces at high densities, a feature important for the construction of various
robust and sensitive bioanalytical devices (91, 92). Either DNA or RNA has to be chosen for
the preparation of aptamers, keeping the final application in mind. For example, RNA is
structurally more flexible than DNA and thus, theoretically, such aptamers can be raised against
a wider range of analytes (81). In contrast, a major limitation of RNA is its susceptibility
to chemical and/or enzymatic degradation. Furthermore, selection of RNA aptamers is a
time-consuming process requiring additional enzymatic steps. Modifications of the DNA or
RNA backbone or the introduction of modified nucleotides can produce aptamers more resistant
to degradation (93).
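
The correspondence between aptamer length and mass quoted above follows from the average mass of
a nucleotide residue. The sketch below assumes roughly 330 g per mol per nucleotide, a common
rule of thumb for single-stranded DNA rather than a figure from this chapter, and ignores
sequence composition and terminal groups.

```python
AVG_NUCLEOTIDE_MASS_DA = 330.0   # assumed average mass per ssDNA residue (Da)

def aptamer_mass_kda(n_nucleotides, avg_mass_da=AVG_NUCLEOTIDE_MASS_DA):
    """Approximate mass (kDa) of a single-stranded oligonucleotide."""
    if n_nucleotides <= 0:
        raise ValueError("n_nucleotides must be positive")
    return n_nucleotides * avg_mass_da / 1000.0

if __name__ == "__main__":
    for length in (15, 60):
        print(f"{length} nt ~ {aptamer_mass_kda(length):.1f} kDa")
```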


Fig. 5 near here 


When an aptamer interacts with its analyte, a conformational change usually occurs, creating
a specific binding site for the target. Aptamers for proteins generally exhibit quite a high
affinity, at the nM or sub-nM level, due to the presence of large complex areas with structures
rich in hydrogen-bond donors and acceptors (80). The relatively high affinity of aptamers makes
such oligonucleotides an attractive alternative to lectins or antibodies as detection reagents
for carbohydrate antigens. In order to widen the palette of analytes recognised by
aptamers, modified nucleotides were introduced.


A rational approach to preparing aptamers that bind glycoproteins with high affinity,
based on extending the nucleotide library with nucleotides bearing a boronic
acid moiety, was recently introduced by Wang's lab (94). The study showed that boronate-modified
DNA aptamers bound fibrinogen with a Kd of 6-17 nM, a higher affinity than
the Kd of 64-122 nM obtained for the same analyte with DNA aptamers built from the natural pool
of nucleotides. The importance of the interaction between the boronate moiety and the glycan
of fibrinogen for binding of the boronate-modified DNA aptamer
was confirmed by analysis of deglycosylated fibrinogen, which was bound with a decreased
affinity (Kd of 87-390 nM) (94).
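
To make the practical meaning of these Kd ranges more tangible, the following minimal sketch evaluates a simple 1:1 binding isotherm for the three cases quoted above. It rests on assumptions not stated in the original study: a single-site equilibrium model, free aptamer concentration approximately equal to total aptamer concentration, and an arbitrary working concentration of 20 nM (`APTAMER_NM`) chosen purely for illustration. The function and variable names are hypothetical.

```python
# Single-site (1:1) binding isotherm: equilibrium fraction of analyte bound
# at a given free aptamer concentration. All concentrations are in nM.

def fraction_bound(aptamer_nM: float, kd_nM: float) -> float:
    """Return the equilibrium fraction bound, [A] / (Kd + [A])."""
    if aptamer_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return aptamer_nM / (kd_nM + aptamer_nM)


# Kd ranges for fibrinogen as quoted in the text (94), in nM.
KD_RANGES_NM = {
    "boronate-modified aptamer, native fibrinogen": (6.0, 17.0),
    "natural-nucleotide aptamer, native fibrinogen": (64.0, 122.0),
    "boronate-modified aptamer, deglycosylated fibrinogen": (87.0, 390.0),
}

APTAMER_NM = 20.0  # assumed working concentration, for illustration only

for label, (kd_low, kd_high) in KD_RANGES_NM.items():
    best = fraction_bound(APTAMER_NM, kd_low)    # tightest binder in the range
    worst = fraction_bound(APTAMER_NM, kd_high)  # weakest binder in the range
    print(f"{label}: {worst:.2f}-{best:.2f} bound at {APTAMER_NM:g} nM")
```

Under this simplified model, half of the analyte is bound when the aptamer concentration equals Kd, so a lower Kd translates directly into a larger bound fraction at any fixed aptamer concentration.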


An interesting approach to widening the nucleotide library was recently
introduced by incorporation of six new 5-position modified dUTP derivatives, five of
which contain an aromatic ring (95). DNA aptamers based on this extended pool of
nucleotides were able, for the first time, to bind tumour necrosis factor receptor superfamily
member 9 (TNFRSF9) with a high affinity (Kd = 4-6 nM). Interestingly, the two new derivatives
containing either an indole derivative or a benzene ring were the best TNFRSF9 binders. This is
notable since TNFRSF9 is a glycoprotein (96), and we can only speculate that these
two aromatic derivatives of dUTP were involved in recognition of TNFRSF9 via a glycan
interaction. In a recent and similar study, nucleotides containing a derivative of imidazole
(7-(2-thienyl)imidazo[4,5-b]pyridine, Ds) were applied for generation of novel DNA aptamers
binding to two glycoproteins, vascular endothelial cell growth factor-165 (VEGF-165) and
interferon-γ (IFN-γ), with an enhanced affinity (97). For VEGF-165, the study revealed a Kd
down to 0.65 pM with DNA aptamers based on Ds-nucleotides, whereas the best Kd for DNA
aptamers containing only natural nucleotides was 57 pM (97). Similarly, for IFN-γ, DNA
aptamers based on Ds-nucleotides offered a much lower Kd of 0.038 nM compared with the Kd of
9.1 nM for DNA aptamers based on natural nucleotides (97). Here, again, we can
only speculate whether the role of Ds nucleotides in the enhanced affinity for the two glycoproteins
lies in an interaction of the Ds-modified nucleotides with the glycan moiety of the glycoproteins.
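
The size of these affinity gains can be put on a common scale with a short calculation. The sketch below converts the Kd values quoted above into a fold improvement and into the corresponding change in binding free energy, using the standard relation ΔΔG = RT ln(Kd,modified / Kd,natural). The temperature of 25 °C is an assumption (it is not stated in the text), and the helper names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)
TEMP_K = 298.15    # assumed temperature (25 degrees C); not given in the text


def fold_improvement(kd_natural: float, kd_modified: float) -> float:
    """Ratio of dissociation constants (same units for both arguments)."""
    return kd_natural / kd_modified


def ddg_kcal_per_mol(kd_natural: float, kd_modified: float,
                     temp_k: float = TEMP_K) -> float:
    """Binding free energy change RT*ln(Kd_modified / Kd_natural).

    A negative value means the modified aptamer binds more tightly.
    """
    return R_KCAL * temp_k * math.log(kd_modified / kd_natural)


# (best Kd with natural nucleotides, best Kd with Ds-nucleotides), in pM,
# as quoted in the text from (97): 9.1 nM = 9100 pM, 0.038 nM = 38 pM.
KD_PAIRS_PM = {
    "VEGF-165": (57.0, 0.65),
    "IFN-gamma": (9100.0, 38.0),
}

for target, (kd_nat, kd_ds) in KD_PAIRS_PM.items():
    fold = fold_improvement(kd_nat, kd_ds)
    ddg = ddg_kcal_per_mol(kd_nat, kd_ds)
    print(f"{target}: {fold:.0f}-fold tighter, ddG = {ddg:.2f} kcal/mol")
```

Both targets show an improvement of roughly two orders of magnitude in Kd, which corresponds to a few kcal/mol of additional binding free energy under the stated assumptions.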


Fig. 6 near here 


3.4. Other Novel Forms of Lectins 


There are several very interesting strategies for enhancing the analytical applicability of
lectins through simple modifications, which may strongly influence the field of
glycoprofiling in the future.


The first study focused on the application of multimers of eight different lectins prepared by
incubation of biotinylated lectins with streptavidin. Wheat germ agglutinin (WGA) multimers
integrated into lectin microarrays showed 4-40 times better sensitivity in the analysis of glycans
in human plasma and much better performance in glycoprofiling of samples from people
with pancreatic cancer than the WGA lectin monomer (98). The authors of the
study suggested that such lectin multimers, with their enhanced affinity towards glycans, can
broaden the range of glycans that can be detected. Moreover, according to the authors, lectin
multimers might provide fundamentally new biorecognition information not achievable with
lectin monomers (98). The second study described attachment of a boronate functionality to
two different lectins in order to enhance the affinity for a particular glycan 2- to 60-fold
(99). Such modified lectins were tested in a whole cell lysate and showed excellent specificity in
the analysis of 295 N-linked glycopeptides. These results revealed that application of boronate-modified
lectins can facilitate identification of glycans present on the surface of low-abundance
glycoproteins (99). The third study showed that, by preparing a mutant of the lectin
Galanthus nivalis agglutinin with an artificially introduced cysteine, it was possible to
prepare lectin dimers via a disulfide linkage between two lectin mutants (100).
The agglutination activity of the lectin dimer was 16-fold higher than that of the lectin monomer and,
interestingly, the monomer/dimer transformation is redox-switchable by addition of mild
oxidising or reducing agents (100).


In summary, very exciting concepts have already been introduced in the field of
lectin glycoengineering, such as integration of deglycosylated forms of recombinant lectins
into lectin microarrays, with enhanced sensitivity of analysis and lower detection limits
achieved. It is only a question of time before a wider range of recombinant lectins with
purification/immobilisation tags is applied for oriented immobilisation of lectins combined
with various transducers or devices. Application of lectins modified by boronate derivatives or
in the form of multimers/dimers is a promising route to the analysis of low-abundance glycoproteins
and possibly to the analysis of glycoproteins that cannot be detected by unmodified lectins.


We propose a future application of lectin peptide aptamers, which could further
enhance the overall order of the immobilisation process compared with immobilisation of
recombinant lectins with different tags for construction of various devices applicable in
glycoprofiling. Although there is one rational approach to designing DNA aptamers with
enhanced affinity towards glycoproteins, by introduction of a boronate moiety into nucleotides,
there are two other reports focused on an extended nucleotide alphabet created for generation
of novel high-affinity DNA aptamers against important targets. Interestingly, in these two
studies glycoproteins in particular were the main targets for the novel DNA
aptamers based on an extended nucleotide alphabet, since such glycoproteins could not be
recognised by “natural” DNA aptamers consisting only of natural nucleotides. Moreover,
the nucleotides in these novel DNA aptamers were modified mainly by aromatic amino acids,
which are usually involved in glycan recognition by lectins. Thus, it remains to be proven in
the future whether such modified nucleotides are really involved in glycan recognition.


4. A Glycocodon Hypothesis 


A codon (n1n2n3) is a sequence of three DNA or RNA nucleotides that corresponds to a
specific amino acid or stop signal during protein synthesis, and the full set of codons is called
the genetic code. The origin of the genetic code
remains one of the unsolved problems, and the enormous number of theories can be divided
into RNA world theories, protein world theories, co-evolution theories and stereochemical
theories. Integration of these theories leads us to the conclusion that a system of four codons
(“gnc”, n = a - adenine, g - guanine, c - cytosine, u - uracil) and four amino acids (G - glycine,
A - alanine, V - valine, D - aspartic acid) could be the original genetic code (101-105).
Research on the selection of particular RNA sequences with amino acid binding activity,
and on the relation of those activities to the genetic code, has revealed evidence of
a highly robust connection between the genetic code and RNA-amino acid binding affinity
(106). It seems that the main part of the genetic code was influenced by stereochemical
prebiotic selection during the first polymerisation of the amino acids G, A, V and D and the
nucleotides g, c, u and a; however, only the first two nucleotides of codons (n1n2) are directly
related to amino acid stereoselectivity (107).
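
The proposed four-codon system can be written out explicitly. The short sketch below enumerates the “gnc” codons and translates them with their assignments in the standard genetic code (ggc = G, gcc = A, guc = V, gac = D). It is only an illustration of the codon set named above, not a model of the cited hypotheses; the function names are hypothetical.

```python
# The four "gnc" codons and their amino acids in the standard genetic code
# (RNA alphabet, lower case as in the text).
GNC_CODE = {"ggc": "G", "gcc": "A", "guc": "V", "gac": "D"}

AMINO_ACID_NAMES = {
    "G": "glycine",
    "A": "alanine",
    "V": "valine",
    "D": "aspartic acid",
}


def gnc_codons() -> list[str]:
    """Enumerate all codons of the form g-n-c, with n in {a, c, g, u}."""
    return ["g" + n + "c" for n in "acgu"]


def translate_gnc(rna: str) -> str:
    """Translate an RNA string made of successive, non-overlapping gnc codons."""
    rna = rna.lower()
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    peptide = []
    for i in range(0, len(rna), 3):
        codon = rna[i:i + 3]
        if codon not in GNC_CODE:
            raise ValueError(f"{codon!r} is not a gnc codon")
        peptide.append(GNC_CODE[codon])
    return "".join(peptide)


for codon in gnc_codons():
    letter = GNC_CODE[codon]
    print(f"{codon} -> {letter} ({AMINO_ACID_NAMES[letter]})")
```

For example, `translate_gnc("ggcgccgucgac")` reads the four codons in succession and yields the peptide `GAVD`.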


Recently, a similar evolutionary process was proposed for “the glycocode” (108).
Bioinformatic quantification of “GAVD-dipeptides” in monosaccharide-specific proteins
revealed that amino acid triplets, the glycocodons (aa1aa2aa3), can be deduced for each
glycan letter (monosaccharide). The glycocodons are composed of one polar amino acid,
interacting with sugar -OH groups, and one specific dipeptide, usually detecting a C-C
hydrophobic patch (see Glc, Gal and GlcNAc binding, Fig. 7a and 7c). Figure 7 depicts
quantification spectra of “GAVD-dipeptides” in glucose (Glc), galactose (Gal), mannose
(Man), fucose (Fuc), N-acetylglucosamine (GlcNAc) and N-acetylgalactosamine (GalNAc)
specific proteins. In the case of Glc, Gal and GalNAc, the maximal values of incidence of
“GAVD-dipeptides” were taken for coding: AA for Glc, GA plus AG for Gal, and DD plus VV
for GalNAc. In the case of Man and GlcNAc, GD plus DG and GV plus VG, respectively, were
taken from the “GAVD-dipeptide pool” for coding, because the maximal values had already been
taken by the previous monosaccharides. During evolution, the GAVD-glycocodons were transformed
into novel glycocodons by positive selection for the increased diversity and functionality of a
“sugar-protein language” that can be made with a larger amino acid alphabet. Nevertheless,
the evolutionary process preserves hydropathic similarity: amino acids in the glycocodons are
substituted by amino acids with similar polar properties, which minimises errors in established
sugar-protein interactions. Bioinformatic quantification of dipeptides composed of all
20 amino acids revealed that GA plus AG for Gal were substituted mainly with SW and WS,
AA for Glc can be substituted with MF, GD plus DG for Man can be substituted with AY plus
YA, GV plus VG for GlcNAc can be substituted with SF plus FS, and DD plus VV for GalNAc
can be substituted with QD plus LF. AV plus VA from the “GAVD-dipeptide pool” were selected
for NAc-group sensing; in the case of GalNAc, they were transformed to MS plus SM and IT
plus TI dipeptides. Figure 7d shows how the GalNAc glycocodons are used during
N-glycosylation by the bacterial oligosaccharyltransferase PglB (Campylobacter lari). PglB
accepts different oligosaccharides from a lipid carrier, requiring an acetamido group at the C2
carbon of the first monosaccharide (GalNAc is the best), and even the “monosaccharide”
N-acetylgalactosamine-diphospho-undecaprenyl is a good substrate (109, 110). PglB connects
the C1 carbon of the first saccharide moiety (N-acetylgalactosamine) with the amide nitrogen
of the acceptor (sequon) asparagine. The DQNATF peptide has been recognised as an optimal
acceptor sequence for PglB (111). According to the glycocodon theory, the DQ dipeptide plus
the acceptor asparagine makes the glycocodon for GalNAc. The subsequent ATF sequence of the
sequon peptide is inserted into the catalytic centre in such a way that the KTI and HLF
glycocodons are formed. This process shows a basic difference between codons and
glycocodons. When codons are read from the nucleotide sequence, they are read in
succession and do not overlap with one another. In contrast, glycocodons operate on a
“key-lock” principle: three different protein chains can form a glycocodon in 3D
space, and glycocodons frequently overlap (Fig. 7).
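
The dipeptide bookkeeping described above can be illustrated with a minimal sketch. It counts overlapping “GAVD-dipeptides” in a protein sequence, maps them to monosaccharides using only the assignments stated in the text, and contrasts overlapping triplet windows with the successive, non-overlapping reading of codons. This is not the pipeline used in (108): it works on the linear sequence only, whereas real glycocodons can be assembled from different chains in 3D space. All function names are hypothetical.

```python
from collections import Counter

GAVD = set("GAVD")

# Ancient dipeptide-to-monosaccharide assignments as stated in the text (108).
DIPEPTIDE_TO_SUGAR = {
    "AA": "Glc",
    "GA": "Gal", "AG": "Gal",
    "GD": "Man", "DG": "Man",
    "GV": "GlcNAc", "VG": "GlcNAc",
    "DD": "GalNAc", "VV": "GalNAc",
    "AV": "NAc group", "VA": "NAc group",
}


def count_gavd_dipeptides(sequence: str) -> Counter:
    """Count overlapping dipeptides made only of G, A, V and D."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if set(pair) <= GAVD:
            counts[pair] += 1
    return counts


def sugar_hits(sequence: str) -> dict:
    """Sum dipeptide counts per monosaccharide, using the table above."""
    hits = Counter()
    for pair, n in count_gavd_dipeptides(sequence).items():
        if pair in DIPEPTIDE_TO_SUGAR:
            hits[DIPEPTIDE_TO_SUGAR[pair]] += n
    return dict(hits)


def overlapping_triplets(sequence: str) -> list[str]:
    """All triplet windows, shifted by one residue (glycocodon-like reading)."""
    seq = sequence.upper()
    return [seq[i:i + 3] for i in range(len(seq) - 2)]


def non_overlapping_triplets(sequence: str) -> list[str]:
    """Successive triplets, read the way codons are read."""
    seq = sequence.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


sequon = "DQNATF"  # optimal PglB acceptor peptide quoted in the text (111)
print(overlapping_triplets(sequon))
print(non_overlapping_triplets(sequon))
```

For the sequon `DQNATF`, the overlapping reading yields the windows DQN, QNA, NAT and ATF, whereas the codon-like reading yields only DQN and ATF.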


Fig. 7 near here 


It should be emphasised that the glycocodons were deduced theoretically in a
bioinformatics study, and laboratory studies will be necessary to establish
the strongest correlation between the monosaccharides and the glycocodons and to
determine the shortest peptides for recognition of a specific monosaccharide. Nevertheless,
the glycocodon theory represents a tool showing how peptide lectin aptamers or novel
DNA aptamers based on nucleotide derivatives should be organised and programmed.


5. Conclusions


This chapter described various tools that have recently been applied to extend the
analytical usefulness of lectin-based devices in glycoprofiling. A positive aspect of recent
efforts in the field is the utilisation of the great potential that nanotechnology can bring to
the complex and challenging analysis of glycans. Such approaches have shown that analysis of
glycans by different biosensors can be extremely sensitive, with a concentration range spanning
a few orders of magnitude, features that are essential in the analysis of low-abundance
glycoproteins. Control of lectin immobilisation is another important issue, which was successfully
addressed in a pilot study showing that attachment of recombinant lectins to surfaces via
various tags present in the recombinant lectin improved the sensitivity of glycan analysis. The
chapter also described the future prospects of peptide lectin aptamers for increasing the
sensitivity and stability of analysis while suppressing non-specific interactions. An additional
issue to focus on in the future is to investigate whether modified nucleotides can be successfully
applied in the preparation of novel DNA aptamers targeting glycoproteins with high affinity and
selectivity. The final part of the chapter described the concept of a glycocodon, which can lead
to a better understanding of glycan-lectin interactions and to the design of novel lectins with
unknown specificities and/or better affinities toward glycan targets, or to the rational design of
peptide lectin aptamers or DNA aptamers.


Figure captions


Figure 1: A) A mechanical detection platform based on an array of microcantilever
biochips, with a cantilever bent after biorecognition has taken place. B) An electronic detection
platform based on electrochemical impedance spectroscopy (EIS), with an
increase in the overall resistance of a soluble redox probe, represented by a red arrow, to the
interface after a biomolecular interaction.


Figure 2: A) Field-effect transistor (FET) sensing based on a change in the conductivity of a
single-walled carbon nanotube (SWCNT) positioned between a source and a drain. B) An
optical detection platform based on quenching of the intrinsic fluorescence of a SWCNT by a
Ni-tether employed for lectin immobilisation via a His6 tag. The fluorescence of the SWCNT is partly
restored after biorecognition since the Ni-tether is pushed away from the SWCNT surface.


Figure 3: Localised surface plasmon resonance (LSPR) employed for label-free
recognition based on a shift of reflected light, as in the traditional SPR technique.


Figure 4: A) A schematic representation of the possibilities for controlling the uniformity of
immobilised glycan binding proteins: random amine coupling (upper image), oriented immobilisation
via an introduced purification tag (middle image) and immobilisation of uniform lectin peptide
aptamers differing only in the peptide insert providing the biorecognition element; GBS, glycan
binding site. B) Various ways of immobilising an Fc-fused lectin on interfaces, based on
boronate affinity towards glycans present in the Fc fragment (left), on the affinity of protein G
towards the Fc fragment (middle) or on random amine coupling (right).


Figure 5: Peptide aptamers based on an affibody (58 AA, PDB code 1LP1, on the left) or a
DARPin (166 AA, PDB code 2BKK, on the right) scaffold, each in complex with its analyte. The peptide
aptamer is in both cases in the bottom part of the figure, while its analyte is above the peptide
aptamer.


Figure 6: An RNA aptamer (purple) bound to its analyte peptide (white-magenta chain)
(PDB code 1EXY).


Figure 7: The glycocodon theory. G (glycine), A (alanine), V (valine) and D (aspartic acid) are
elementary amino acids, and the first primordial interactions between GAVD-peptides and
sugars were evolutionarily conserved and used in the glycocodons; the full set of glycocodons
is proposed to be called the glycocode (108). A) A distribution of GAVD-dipeptides in galactose
(Gal) and glucose (Glc) specific proteins (108). In human Galectin-3 (a yellow protein
structure, 1KJL), Gal is sensed by two overlapping glycocodons, NWG plus WGR, that are
derived from the ancient GA and AG specific dipeptides; in perchloric acid-soluble protein from
Pseudomonas syringae (a blue protein structure, 3KOT), Glc is sensed by the RAA
glycocodon; the ancient AA specific dipeptide can today be transformed to MF. B) A distribution
of GAVD-dipeptides in mannose (Man) and fucose (Fuc) specific proteins (108). In
bacteriocin from Pseudomonas sp. complexed with Met-mannose (a cyan protein structure,
3M7J), Man is sensed by two overlapping glycocodons, QGD plus DGN; the ancient GD and DG
specific dipeptides can today be transformed to AY and YA dipeptides; in the PA-IIL lectin from P.
aeruginosa (an orange protein structure, 2JDK), Fuc is sensed by the specific glycocodon
NSS and by the two glycocodons GTQ and GTD, derived from the ancient GA dipeptide specific for
Gal (Fuc is actually 6-deoxy-L-galactose). C) A distribution of GAVD-dipeptides in
N-acetylglucosamine (GlcNAc) and glucose (Glc) specific proteins (108). In human L-ficolin (an
assembly of two monomers is shown, 2J30), GlcNAc is sensed by two overlapping
glycocodons, RVG plus VGE; the ancient VG specific dipeptide can today be transformed to the
FS dipeptide. D) A distribution of GAVD-dipeptides in N-acetylgalactosamine (GalNAc) and
galactose (Gal) specific proteins (108). In oligosaccharyltransferase from Campylobacter lari
(3RCE), GalNAc (the first moiety of the oligosaccharide) is sensed by two overlapping
glycocodons, EMI plus ITE, derived from the ancient VV and VA dipeptides specific for GalNAc.
GalNAc is transferred to the asparagine of the DQN glycocodon; the DQ dipeptide is derived from
the ancient DD dipeptide specific for GalNAc.